METHOD AND DEVICE FOR COMPRESSED-DOMAIN VIDEO EDITING



Cross References to Related Patent Applications 

The present patent application is related to U.S. Patent Application Serial No.
10/737,184, filed December 16, 2003, assigned to the assignee of the present patent
application. The present invention is also related to U.S. Patent Application Docket No.
944-001-129, assigned to the assignee of the present application, filed even date herewith.

Field of the Invention 

The present invention relates generally to video coding and, more particularly, to
video editing.

Background of the Invention 

Digital video cameras are increasingly spreading among the masses. Many of the
latest mobile phones are equipped with video cameras offering users the capabilities to
shoot video clips and send them over wireless networks.

Digital video sequences are very large in file size. Even a short video sequence is
composed of tens of images. As a result, video is always saved and/or transferred in
compressed form. There are several video-coding techniques that can be used for this
purpose. MPEG-4 and H.263 are the most widely used standard compression formats
suitable for wireless cellular environments.

To allow users to generate quality video at their terminals, it is imperative to
provide video editing capabilities to electronic devices, such as mobile phones,
communicators and PDAs, that are equipped with a video camera. Video editing is the
process of modifying available video sequences into a new video sequence. Video
editing tools enable users to apply a set of effects on their video clips aiming to produce a
functionally and aesthetically better representation of their video. To apply video editing
effects on video sequences, several commercial products exist. However, these software
products are targeted mainly for the PC platform.

Since processing power, storage and memory constraints are not an issue in the
PC platform these days, the techniques utilized in such video-editing products operate on
video sequences mostly in their raw formats in the spatial domain. In other words, the
compressed video is first decoded, the editing effects are then introduced in the spatial
domain, and finally the video is encoded again. This is known as a spatial domain video
editing operation.

The above scheme cannot be applied on devices, such as mobile phones, with low
resources in processing power, storage space, available memory and battery power.
Decoding a video sequence and re-encoding it are costly operations that take a long time
and consume a lot of battery power.

In prior art, video effects are performed in the spatial domain. More specifically,
the video clip is first decompressed and then the video special effects are performed.
Finally, the resulting image sequences are re-encoded. Figure 1 illustrates the general
procedure in conventional video editing. The major disadvantage of this approach is that
it is significantly computationally intensive, especially the encoding part. Such a system
is unsuitable for a mobile platform. Because of the requirements in spatial domain
operations, video editing systems on mobile devices are rarely used, and the available
editing features are also very limited.

It is thus advantageous and desirable to provide a method of video editing without
the disadvantages of the prior art process.

Summary of the Invention 

The present invention provides a method and device for compressed-domain video
editing, wherein a parser is used to separate audio data from video data in a media file so
that the audio data and video data can be edited separately. In particular, a frame analyzer
is used to determine whether the video data are suitable for compressed domain editing or
spatial domain processing based on the frame characteristics of the input video frames.

Thus, the first aspect of the present invention provides a method of editing one or
more input video frames in a bitstream for providing one or more edited video frames, the
edited video frames including at least one editing effect specified by one or more editing
parameters. The method comprises:

identifying frame characteristics of at least one input video frame in the bitstream;
and

modifying the bitstream in the compressed domain based on the frame
characteristics of said at least one frame and the specified editing parameters for
providing a modified bitstream indicative of said edited video frames.

According to the present invention, the input video frames contain video data and
wherein said modifying comprises modification of the video data in a compression
domain processor for providing edited frame data.

According to the present invention, the video data are coded with a variable- 
length code (VLC). The method further comprises: 

converting the VLC coded video data into a binary form prior to said 
modification. It is possible that the method further comprises: 

inversely quantizing the VLC coded video data prior to said converting, and 

processing the VLC coded video data in an inverse cosine transform operation 
prior to said converting. 

According to the present invention, the method further comprises: 

identifying frame characteristics of at least one further video frame in the 
bitstream; 

modifying the bitstream in a further domain different from the compressed 
domain based on the frame characteristics of said at least one further video frame and the 
specified editing parameters for providing a further modified bitstream; and 

combining at least a part of the further modified bitstream with at least a part of 
the modified bitstream. 

The further domain is a spatial domain or a file format domain. 

According to the present invention, the method further comprises: 

converting the edited frame data into an edited media file for use in a media 
player; and 

providing format information indicative of editing properties of the edited frame 
data so as to convert the edited frame data into the edited media file compatible to the 
media player. 

According to the present invention, when the bitstream also contains audio data 
separable from the video data in the input video frames, the method further comprises: 

combining the audio data with the edited frame data prior to said converting; 

modifying the audio data prior to said combining, if so desired; and 

providing timing information so as to maintain synchronization between the audio 
data and edited frame data in said combining. 

According to the present invention, the editing parameters are specified based on 
one or more editing preferences chosen by a user. 



PATENT 
944-001.128 



The second aspect of the present invention provides a media editing device for
editing one or more input video frames in a bitstream for providing one or more edited
video frames, the edited video frames including at least one editing effect specified by
one or more editing parameters. The editing device comprises:

a frame analyzer module, responsive to signals indicative of video frame data, for
identifying frame characteristics of at least one input video frame in the bitstream; and

a compressed domain processing module, responsive to signals indicative of the
frame characteristics, for modifying the video frame data based on the frame
characteristics of said at least one frame and the specified editing parameters for
providing modified video data indicative of said edited video frames.

According to the present invention, the frame analyzer further identifies frame
characteristics of at least one further video frame in the bitstream. The editing device
further comprises:

a spatial domain processing module, responsive to signals indicative of the frame
characteristics of the further video frame, for modifying video frame data in the further
video frame based on the frame characteristics of the further video frame and the
specified editing parameters for providing further modified video data; and

a module for combining at least a part of the further modified video data with at
least a part of the modified video data.

According to the present invention, the editing device further comprises:

a format composer module, responsive to signals indicative of the modified video
data, for converting the modified video data into an edited media file for use in a media
player, and the frame analyzer module further identifies format information indicative of
editing properties of the modified video data so as to convert the modified video data into
the edited media file compatible to the media player.

The format composer module can be a file format composer or a media format 
composer. 

According to the present invention, when the bitstream also comprises audio data,
the editing device further comprises:

a format parser module, for separating the audio from the video frame data in the
input video frames;

an audio processing module for modifying the audio data for providing modified
audio data, if so desired;

a combination module for combining the modified video data and the modified
audio data for providing combined signals indicative of the combined data; and

a file or media format composer, responsive to the combined signals, for
converting the combined data into an edited media file for use in a media player.

The third aspect of the present invention provides a communications device
capable of editing media files for providing one or more editing effects in one or more
edited video frames, the editing media files comprising one or more input video frames.
The communications device comprises:

a video editing application module for allowing a user to specify the editing
effects; and

a video editing system comprising:

a compressed domain processing module, responsive to signals indicative
of the input video frames, for modifying video frame data in one or more video
frames based on the specified editing effects for providing modified video data
indicative of said edited video frames; and

a frame analyzer module, responsive to signals indicative of the video
frame data, for identifying frame characteristics of at least one input video frame,
so as to allow the compressed domain processing module to modify the video
frame data also based on the frame characteristics.

According to the present invention, the frame analyzer further identifies frame
characteristics of at least one further video frame in the bitstream, and the editing system
further comprises:

a spatial domain processing module, responsive to signals indicative of the frame
characteristics of the further video frame, for modifying video frame data in the further
video frame based on the frame characteristics of the further video frame and the
specified editing parameters for providing further modified video data;

a module for combining at least a part of the further modified video data with at
least a part of the modified video data; and

a format composer module, responsive to signals indicative of the modified video
data, for converting the modified video data into an edited media file for use in a media
player.

According to the present invention, the communications device further comprises:

a display screen for displaying video images based on the modified video data.

The communications device can be a mobile terminal, a communicator device, a 
PDA or the like. 



The fourth aspect of the present invention provides a software product for use in a
video editing system for editing one or more input video frames in a bitstream for
providing one or more edited video frames, the edited video frames including at least one
editing effect specified by one or more editing parameters. The software product
comprises:

a code for identifying frame characteristics of at least one input video frame in the
bitstream; and

a code for modifying video data in one or more input video frames in the
compressed domain based on the frame characteristics of said at least one frame and the
specified editing parameters so as to provide a modified video data indicative of said
edited video frames.

According to the present invention, when the input video frames contain video
data coded with variable-length code (VLC), the software product further comprises:

a code for converting the VLC coded video data into a binary form prior to
modification of video data in one or more input video frames.

According to the present invention, the identifying code also identifies frame
characteristics of at least one further input video frame and the software product further
comprises:

a code for modifying video data in one or more further input video frames in a
further domain different from the compressed domain based on the frame characteristics
of said further input video frame and the specified editing parameters so as to provide
modified further video data. The further domain can be a spatial domain or a file format
domain.

According to the present invention, the software product further comprises:

a code for combining the modified further video data with the modified video data
for providing the edited video frames; and

a code for converting the modified video data into an edited media file for use in a
media player.


The fifth aspect of the present invention provides a media coding system,
comprising:

a media encoder for encoding media data for providing encoded media data in a
plurality of frames having frame data;

a media editing device, responsive to the encoded media data, for providing edited
data including one or more edited frames, the edited frames having at least one editing
effect specified by one or more editing parameters; and

a media decoder, responsive to the edited data, for providing decoded media data,
wherein the editing device comprises:

a frame analyzer module, responsive to signals indicative of encoded data, for
identifying frame characteristics of at least one frame in the encoded data; and

a compressed domain processing module, responsive to signals indicative of the
frame characteristics, for modifying the encoded frame data based on the frame
characteristics of said at least one frame and the specified editing parameters for
providing modified media data indicative of said edited media frames.

According to the present invention, the media encoder has a connectivity
mechanism and the editing device has a further connectivity mechanism so as to allow the
editing device to communicate with the media encoder in order to receive therefrom
encoded media data in a wireless fashion.

According to the present invention, the media decoder has a connectivity
mechanism and the editing device has a further connectivity mechanism so as to allow the
editing device to provide the edited data to the media decoder in a wireless fashion.

According to the present invention, the media encoder and the editing system are
integrated in an expanded encoding system.

According to the present invention, the media decoder has a connectivity
mechanism and the expanded encoding system has a further connectivity mechanism so
as to allow the expanded encoding system to provide the edited data to the media decoder
in a wireless fashion.

According to the present invention, the media decoder and the editing system are
integrated in an expanded decoding system.

According to the present invention, the media encoder has a connectivity
mechanism and the expanded decoding system has a further connectivity mechanism so
as to allow the media encoder to provide the edited data to the expanded decoding system
in a wireless fashion.

According to the present invention, each of the connectivity mechanism and the
further connectivity mechanism comprises a Bluetooth connectivity module, an infra-red
module, or a wireless LAN device.


The present invention will become apparent upon reading the description taken in 
conjunction with Figures 2-10. 

Brief Description of the Drawings

Figure 1 is a block diagram illustrating the process of prior art video editing.

Figure 2 is a schematic representation illustrating the principle of compressed-
domain video editing, according to the present invention.

Figure 3 is a block diagram illustrating a typical video editing system for mobile
devices.

Figure 4 is a block diagram illustrating a video editing processor system,
according to the present invention.

Figure 5 is a block diagram illustrating a video processor, according to the present
invention.

Figure 6 is a block diagram illustrating a spatial domain video processor.

Figure 7 is a block diagram illustrating an audio processor.

Figure 8 is a schematic representation illustrating a typical video sequence to be
edited.

Figure 9 is a schematic representation illustrating a portable device, which can
carry out compressed-domain video editing, according to the present invention.

Figure 10 is a block diagram illustrating a media coding system, which includes a
video processor, according to the present invention.


Detailed Description of the Invention 

The video editing procedure, according to the present invention, is based on
compressed domain operations. As such, it reduces the use of decoding and encoding
modules. As shown in Figure 2, the editing is carried out in a compressed domain
processor. Figure 3 illustrates a typical editing system designed for a communication
device, such as a mobile phone. This editing system can incorporate the video editing
method and device, according to the present invention. The video editing system 10, as
shown in Figure 3, comprises a video editing application module 12 (graphical user
interface), which interacts with the user to exchange video editing preferences. The
application uses the video editor engine 14, based on the editing preferences defined or
selected by the user, to compute and output video editing parameters to the video editing
processor module 18. The video editing processor module 18 uses the principle of
compressed-domain editing to perform the actual video editing operations. If the video
editing operations are implemented in software, the video editing processor module 18
can be a dynamically linked library (DLL). Furthermore, the video editor engine 14 and the
video editing processor module 18 can be combined into a single module.

A top-level block diagram of the video editing processor module 18 is shown in
Figure 4. As shown, the editing processor module 18 takes in a media file 100, which is
usually a video file that may have audio embedded therein. The editing processor module
18 performs the desired video and audio editing operations in the compressed domain,
and outputs an edited media file 180. The video editing processor module 18 consists of
four main units: a file format parser 20, a video processor 30, an audio processor 60, and
a file format composer 80.

A. File Format Parser

Media files, such as video and audio, are almost always in some standard encoded
format, such as H.263, MPEG-4 for video and AMR-NB, CELP for audio. Moreover, the
compressed media data is usually wrapped in a file format, such as MP4 or 3GP. The file
format contains information about the media contents that can be effectively used to
access, retrieve and process parts of the media data. The purpose of the file format parser
is to read in individual video and audio frames, and their corresponding properties, such
as the video frame size, its time stamp, and whether the frame is an intra frame or not.
The file format parser 20 reads individual media frames from the media file 100 along
with their frame properties and feeds this information to the media processor. The video
frame data and frame properties 120 are fed to the video processor 30 while the audio
frame data and frame properties 122 are fed to the audio processor 60, as shown in
Figure 4.


B. Video Processor 

The video processor 30 takes in video frame data and its corresponding properties,
along with the editing parameters (collectively denoted by reference numeral 120) to be
applied on the media clip. The editing parameters are passed by the video editing engine
14 to the video editing processor module 18 in order to indicate the editing operation to
be performed on the media clip. The video processor 30 takes these editing parameters
and performs the editing operation on the video frame in the compressed domain. The
output of the video processor is the edited video frame along with the frame properties,
which are updated to reflect the changes in the edited video frame. The details of the
video processor 30 are shown in Figure 5. As shown, the video processor 30 consists of
the following modules:

B.1. Frame Analyzer

The main function of the Frame Analyzer 32 is to look at the properties of the
frame and determine the type of processing to be applied on it. Different frames of a
video clip may undergo different types of processing, depending on the frame properties
and the editing parameters. The Frame Analyzer makes the crucial decision of the type of
processing to be applied on the particular frame. A typical video bitstream is shown in
Figure 8. Different parts of the bitstream will be acted upon in different ways, depending
on the frame characteristics of the bitstream and the specified editing parameters. As
shown in Figure 8, some portions of the bitstream are not included in the output movie,
and will be thrown away. Some will be thrown away only after being decoded. Others
will be re-encoded to convert from P- to I-frame. Some will be edited in the compressed
domain and added to the output movie, while still others will be simply copied to the
movie without any changes. It is the job of the Frame Analyzer to make all these
crucial decisions.
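
By way of illustration only, the per-frame decision described above can be sketched in
Python as follows. The names Action and analyze, the encoding of frame types as the
strings "I" and "P", and the effect_frames set are assumptions made for this example
and are not part of the described system; an actual Frame Analyzer 32 may take further
frame properties and editing parameters into account.

```python
from enum import Enum, auto


class Action(Enum):
    DISCARD = auto()            # not in the output movie, thrown away
    DECODE_ONLY = auto()        # decoded as a reference, then thrown away
    REENCODE_AS_INTRA = auto()  # P-frame at the cut point, converted to I-frame
    COMPRESSED_EDIT = auto()    # edited in the compressed domain
    COPY = auto()               # copied to the output unchanged


def analyze(frame_types, cut_start, cut_end, effect_frames=frozenset()):
    """Return one Action per frame for a clip cut to [cut_start, cut_end].

    frame_types is a sequence of "I"/"P" strings (hypothetical encoding);
    effect_frames holds the indices that an editing effect applies to.
    """
    needs_conversion = frame_types[cut_start] == "P"
    anchor = cut_start
    if needs_conversion:
        # Walk back to the I-frame that precedes the beginning cut point.
        while anchor > 0 and frame_types[anchor] != "I":
            anchor -= 1

    actions = []
    for i in range(len(frame_types)):
        if i < anchor or i > cut_end:
            actions.append(Action.DISCARD)
        elif i < cut_start:
            actions.append(Action.DECODE_ONLY)
        elif i == cut_start and needs_conversion:
            actions.append(Action.REENCODE_AS_INTRA)
        elif i in effect_frames:
            actions.append(Action.COMPRESSED_EDIT)
        else:
            actions.append(Action.COPY)
    return actions
```

In this sketch a frame that is both the converted cut-point frame and the target of an
effect is reported only as REENCODE_AS_INTRA; how such a frame is actually handled is
a design choice that the example does not settle.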

B.2. Compressed Domain Processor

The core processing of the frame in the compressed domain is performed in the
compressed domain processor 34. The compressed video data is changed to apply the
desired editing effect. This module can perform various different kinds of operations on
the compressed data. One of the common ones among them is the application of the Black
& White effect where a color frame is changed to a black & white frame by removing the
chrominance data from the compressed video data. Other effects that can be performed by
this module are the special effects (such as color filtering, sepia, etc.) and the transitional
effects (such as fading in and fading out, etc.). Note that the module is not limited only to
these effects, but can be used to perform all possible kinds of compressed domain editing.

Video data is usually VLC (variable-length code) coded. Hence, in order to
perform the editing in the compressed domain, the data is first VLC decoded so that data
can be represented in regular binary form. The binary data is then edited according to the
desired effect, and the edited binary data is then VLC coded again to bring it back to
compliant compressed form. Furthermore, some editing effects may require more than
VLC decoding. For example, the data is first subjected to inverse quantization and/or
IDCT (inverse discrete cosine transform) and then edited. The edited data is re-quantized
and/or subjected to DCT operations to bring it back to compliant compressed form.
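
As a minimal sketch of the Black & White effect, assume that a macroblock has already
been VLC decoded into blocks of 64 quantized transform coefficients. The Macroblock
structure, the function to_black_and_white and the neutral_dc parameter below are
hypothetical names chosen for this illustration; the value that represents neutral
chrominance depends on the codec and on whether the block is intra or inter coded, so
it is left as a parameter. VLC decoding and re-coding are outside the scope of the sketch.

```python
from dataclasses import dataclass, replace
from typing import Tuple

Block = Tuple[int, ...]  # 64 quantized transform coefficients, DC term first


@dataclass(frozen=True)
class Macroblock:
    luma: Tuple[Block, Block, Block, Block]  # four luminance blocks
    cb: Block                                # chrominance (blue difference)
    cr: Block                                # chrominance (red difference)


def to_black_and_white(mb: Macroblock, neutral_dc: int = 0) -> Macroblock:
    """Remove the chrominance data while leaving the luminance untouched.

    neutral_dc is an assumed, codec-dependent DC level for neutral chroma.
    """
    flat: Block = (neutral_dc,) + (0,) * 63
    return replace(mb, cb=flat, cr=flat)
```

Because only the chrominance coefficients are rewritten, the luminance data never has
to pass through inverse quantization or IDCT, which is the saving that compressed
domain editing is aiming for.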

B.3. Decoder

Although the present invention is concerned with compressed domain processing,
there is still a need to decode frames. As shown in Figure 5, the video processor 30
comprises a decoder 36, operatively connected to the frame analyzer 32 and the
compressed domain processor 34, possibly via an encoder 38. Taking the video bitstream
shown in Figure 8 as an example, if the beginning cut point in the input video falls on a
P-frame, then this frame simply cannot be included in the output movie as a P-frame. The
first frame of a video sequence must always be an I-frame. Hence, there is a need
to convert this P-frame to an I-frame.

In order to convert the P-frame to an I-frame, the frame must first be decoded.
Moreover, since it is a P-frame, the decoding must start all the way back at the first
I-frame preceding the beginning cut point. Hence, the decoder 36 is required to
decode the frames from the preceding I-frame to the first included
frame. This frame is then sent to the encoder 38 for re-encoding.

B.4. Spatial Domain Processor

It is possible to incorporate a spatial domain processor 50 in the compressed
domain editing system, according to the present invention. The spatial domain processor
50 is used mainly in the situation where compressed domain processing of a particular
frame is not possible. There may be some effects, special or transitional, that are not
possible to apply directly to the compressed binary data. In such a situation, the frame is
decoded and the effects are applied in the spatial domain. The edited frame is then sent to
the encoder for re-encoding.

The Spatial Domain Processor 50 can be decomposed into two distinct modules,
as shown in Figure 6. The Special Effects Processor 52 is used to apply special effects on
the frame (such as Old Movie effect, etc.). The Transitional Effects Processor 54 is used
to apply transitional effects on the frame (such as Slicing transitional effect, etc.).

B.5. Encoder

If a frame is to be converted from P- to I-frame, or if some effect is to be applied
on the frame in the spatial domain, then the frame is decoded by the decoder and the
optional effect is applied in the spatial domain. The edited raw video frame is then sent to
the encoder 38 where it is compressed back to the required type of frame (P- or I-), as
shown in Figure 5.

B.6. Pre-Composer 

The main function of the Pre-Composer 40, as shown in Figure 5, is to update the
properties of the edited frame so that it is ready to be composed by the File Format
Composer 80 (Figure 4).

When a frame is edited in the compressed domain, the size of the frame changes.
Moreover, the time duration and the time stamp of the frame may change. For example, if
slow motion is applied on the video sequence, the time duration of the frame, as well as
its time stamp, will change. Likewise, if the frame belongs to a video clip that is not the
first video clip in the output movie, then the time stamp of the frame will be translated to
account for the duration of the clips that precede it, even though the individual time
duration of the frame will not change.

If the frame is converted from a P-frame to an I-frame, then the type of the frame
changes from inter to intra. Also, whenever a frame is decoded and re-encoded, it will
likely cause a change in the coded size of the frame. All of these changes in the properties
of the edited frame must be updated and reflected properly. The composer uses these
frame properties to compose the output movie in the relevant file format. If the frame
properties are not updated correctly, the movie cannot be composed.
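
The property updates described above can be sketched as follows, purely as an
illustration. FrameProps, update_props and their fields are hypothetical names;
the sketch assumes time stamps in milliseconds measured from the start of the
frame's own clip, a speed factor below 1.0 for slow motion, and a clip offset equal
to the total duration of the clips placed earlier in the output movie.

```python
from dataclasses import dataclass, replace
from typing import Optional


@dataclass(frozen=True)
class FrameProps:
    timestamp_ms: int   # relative to the start of the frame's own clip
    duration_ms: int
    size_bytes: int
    is_intra: bool


def update_props(props: FrameProps, *, clip_offset_ms: int = 0,
                 speed: float = 1.0, new_size: Optional[int] = None,
                 now_intra: Optional[bool] = None) -> FrameProps:
    """Return the properties of a frame after editing.

    The time stamp and duration are stretched by 1/speed (slow motion when
    speed < 1.0); the time stamp is then shifted by the duration of the
    preceding clips. Size and frame type change only when supplied.
    """
    return replace(
        props,
        timestamp_ms=clip_offset_ms + round(props.timestamp_ms / speed),
        duration_ms=round(props.duration_ms / speed),
        size_bytes=props.size_bytes if new_size is None else new_size,
        is_intra=props.is_intra if now_intra is None else now_intra,
    )
```

For example, a frame at 1000 ms with a duration of 40 ms, played at half speed,
would be reported at 2000 ms with a duration of 80 ms; placed in a second clip
that starts 5000 ms into the movie at normal speed, it would be reported at 6000 ms
with its duration unchanged.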

C. Audio Processor 

Video clips usually have audio embedded inside them. The audio processor 60, as 
shown in Figures 4 and 7, is used to process the audio data in the input video clips in 
accordance with the editing parameters to generate the desired audio effect in the output 
movie. 

There can be many different kinds of audio operations in the editing system, as
shown in Figure 7. The most common among these operations are retaining the original
audio, replacing the original audio with new audio, and muting the audio. Upon receiving
the audio frame data and audio frame information 121, including the desired audio effect
specified by the editing parameters, from the file format parser 20, an information
processor 62 finds out what kinds of audio operations are specified and sends the different
data in the audio frame data to different audio processing modules for processing.

C.1. Retain Original Audio

The most common case in audio data processing in the audio processor is to retain
the original audio in the edited video clip. In this case, the necessary audio frames are
extracted from the video clip 162 a and included in the output edited clip 164 by a frame
extractor module 64. It is crucial that proper audio/video synchronization be
maintained when including original audio. A video clip may be cut from any arbitrary
point. The cut points of the video and audio must match exactly in order to avoid any
audio drift in the edited video clip. For that matter, timing information 132 about the
video is supplied to the audio processor for synchronization. With a compressed-domain
audio processor 65, it is possible to process the audio frame 164 in the compressed
domain. For example, if the processor 65 includes various sub-modules and software
programs, various compressed-domain operations such as audio fading, audio filtering,
audio mixing, special audio effects and the like can be achieved.




C.2. Replace New Audio 

It is also possible for the audio processor to include audio from another source and
replace the original audio in the video clip with the new audio sample. Also, it is possible
to insert this new audio sample at any point in the output movie and for any duration of
the output movie. If the new audio sample has a shorter duration than the duration to
insert, then the audio processor is able to loop the audio so that it plays back repeatedly
for the total duration of the audio insertion. For audio data replacement purposes, a frame
extractor 68 (which could be the same extractor 64) is operatively connected to an audio
source 67 to obtain a new audio sample 167 and to output the new audio sample as new
audio frames 168 at proper timing. With a compressed-domain audio processor 69, it is
possible to process the audio frame 168 in the compressed domain. For example, if the
processor 69 includes various sub-modules and software programs, various
compressed-domain operations such as audio fading, audio filtering, audio mixing,
special audio effects and the like can be achieved.
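
The looping behavior described above can be illustrated with the following sketch.
The function loop_audio and its parameters are hypothetical; the sketch assumes
fixed-duration audio frames (20 ms per frame is used in the example, as in AMR-NB)
and rounds the number of frames up, so the inserted audio may overrun the requested
duration by less than one frame.

```python
from itertools import cycle, islice


def loop_audio(sample_frames, frame_ms, insert_ms):
    """Repeat a short audio sample until it covers insert_ms milliseconds.

    sample_frames: coded audio frames of the new sample, in order.
    frame_ms: duration of each audio frame (assumed constant).
    insert_ms: total duration over which the new audio is inserted.
    """
    if not sample_frames:
        raise ValueError("the new audio sample contains no frames")
    needed = -(-insert_ms // frame_ms)  # ceiling division
    return list(islice(cycle(sample_frames), needed))
```

For instance, a three-frame sample inserted over 100 ms at 20 ms per frame would yield
five frames, with the sample starting again after its third frame. Since whole coded
frames are repeated, no audio decoding is involved.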

C.3. Mute Audio 

The audio processor is also able to mute the original audio for any duration of the
output movie, so that the edited movie does not have any audio for the duration of the
mute. There are different ways of muting audio in the movie. It is possible that the audio
processor simply does not provide any audio frames for the particular duration when
audio is to be muted. Alternatively, a silent frame generator 66 is used to insert "silent"
audio frames 166 into the audio frame data such that, when played back, the audio frames
give the effect of silence or mute in the output movie.
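
The second muting approach can be sketched as below. The function mute and the
silent_frame argument are assumptions for this example; the sketch presumes
fixed-duration audio frames and that a coded frame representing silence is available
for the audio format in use, and it mutes every frame whose start time falls inside
the requested interval.

```python
def mute(frames, frame_ms, start_ms, end_ms, silent_frame):
    """Replace the audio frames starting in [start_ms, end_ms) by silent frames.

    frames: coded audio frames in playback order, each frame_ms long.
    silent_frame: a coded frame that decodes to silence (assumed available).
    """
    out = []
    for i, frame in enumerate(frames):
        start = i * frame_ms
        out.append(silent_frame if start_ms <= start < end_ms else frame)
    return out
```

Keeping one output frame per input frame leaves the audio timeline unchanged, which
keeps the synchronization with the video simple.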


The outputs from the various audio processing modules, such as the frame extractors
64, 68 and the silent frame generator 66, are combined in an audio frame combination
module 70 for providing the processed audio frames 170. The output 170 from the audio
frame combination module 70 can further be subjected to compressed-domain audio
processing by which the inserted audio frames are edited in the compressed domain to
change their contents by a compressed domain audio processor 71. The audio processor
71 can be used in addition to the audio processors 65 and 69, or instead of the audio
processors 65 and 69.

It should be noted that audio processing is not limited to these three operations 
only. There can be any number of various audio processing capabilities included in the 
audio processor, such as audio mixing, multiple audio channel support, etc. The above 
discussion is for illustrative purposes only. 

Audio frames are generally shorter in duration than their corresponding video 
frames. Hence, more than one audio frame is generally included in the output movie for 
every video frame. Therefore, an adder is needed in the audio processor to gather all the 
audio frames corresponding to the particular video frame in the correct timing order. The 
processed audio frames are then sent to the composer for composing them in the output 
movie. 
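
The gathering step described above can be sketched as follows. This is a minimal illustration under stated assumptions: timestamps are in milliseconds and sorted, the function name `group_audio_by_video` is hypothetical, and an audio frame is assigned to the last video frame whose timestamp does not exceed its own.

```python
from typing import Dict, List


def group_audio_by_video(video_ts: List[int], audio_ts: List[int]) -> Dict[int, List[int]]:
    """Map each video frame timestamp to the audio frame timestamps it spans.

    Both lists are assumed sorted. Audio frames that start before the first
    video frame are assigned to the first video frame.
    """
    if not video_ts:
        return {}
    groups: Dict[int, List[int]] = {ts: [] for ts in video_ts}
    v = 0
    for a in audio_ts:
        # Advance to the video frame whose interval contains this audio frame.
        while v + 1 < len(video_ts) and video_ts[v + 1] <= a:
            v += 1
        groups[video_ts[v]].append(a)
    return groups


# Video frames 100 ms apart, audio frames every 20 ms.
video = [0, 100, 200]
audio = list(range(0, 300, 20))
grouped = group_audio_by_video(video, audio)
```

In this example each video frame collects five audio frames, kept in their original timing order.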



D. File Format Composer 

Once the media frames (video, audio, etc.) have been edited and processed, they 
are sent to the File Format Composer 80, as shown in Figure 4. The composer 80 
receives the edited video 130 and audio frames 160, along with their respective frame 
properties, such as frame size, frame timestamps, frame type (e.g., P- or I-), etc. It then 
uses this frame information to compose and wrap the media frame data in the proper file 
format and with the proper video and audio timing information. The result is the final 
edited media file 180 in the relevant file format, playable in any compliant media player. 
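
The role of the composer can be sketched as follows. This is not a writer for any real file format: the `MediaFrame` record, the `compose` function and the index layout are hypothetical, and the sketch only shows how frame size, timestamp and frame type could be collected into an index alongside the interleaved frame data.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class MediaFrame:
    track: str         # "video" or "audio"
    timestamp_ms: int
    frame_type: str    # e.g. "I" or "P" for video, "A" for audio (labels assumed)
    data: bytes


def compose(frames: List[MediaFrame]) -> Tuple[bytes, List[dict]]:
    """Interleave frames by timestamp and build a simple per-frame index.

    Returns the concatenated payload and one index entry per frame holding
    the properties a real file-format writer would store in its tables.
    """
    payload = bytearray()
    index = []
    # sorted() is stable, so frames with equal timestamps keep input order.
    for f in sorted(frames, key=lambda fr: fr.timestamp_ms):
        index.append({
            "track": f.track,
            "timestamp_ms": f.timestamp_ms,
            "type": f.frame_type,
            "offset": len(payload),
            "size": len(f.data),
        })
        payload += f.data
    return bytes(payload), index


frames = [
    MediaFrame("video", 0, "I", b"VVVV"),
    MediaFrame("video", 100, "P", b"vv"),
    MediaFrame("audio", 0, "A", b"a"),
    MediaFrame("audio", 20, "A", b"b"),
]
payload, index = compose(frames)
```

A real composer would additionally write the container headers and timing tables required by the target file format.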

The present invention, as described above, provides the advantage that the video 
editing operations can be implemented in a small portable device, such as a mobile 
phone, a communicator or a personal digital assistant (PDA), that is equipped with a video 
camera or capable of receiving video data from an external source. Figure 9 is a 
schematic representation of a portable device, which can be used for compressed-domain 
video editing, according to the present invention. As shown in Figure 9, the portable 
device 1 comprises a display 5, which can be used to display a video image, for example. 
The device 1 also comprises a video editing system 10, including a video editing 
application 12, a video editing engine 12 and a video editing processor 18 as shown in 
Figure 3. The video editing processor 18 receives the input media file 100 from a media file 
source 210 and conveys the output media file 180 to a media file receiver 220. The 
media file source 210 can be a video camera, which can be a part of the portable device 1. 
Alternatively, the media file source 210 can be a video receiver operatively connected to a 
video camera. The video receiver can be a part of the portable device. Furthermore, the 
media file source 210 can be a bitstream receiver, which is a part of the portable device, 
for receiving a bitstream indicative of the input media file. The edited media file 180 can 
be displayed on the display 5 of the portable device 1. Alternatively, the edited media file 
180 can be conveyed to the media file receiver 220, such as a storage medium or a video 
transmitter. The storage medium and the video transmitter can also be part of the portable 
device. Moreover, the media file receiver 220 can also be an external display device. It 
should be noted that the portable device 1 also comprises a software program 7 to carry out 
many of the compressed-domain editing procedures as described in conjunction with 
Figures 4, 5 and 7. For example, the software program 7 can be used for file format 
parsing, file format composing, frame analysis and compressed-domain frame processing. 

It should be noted that the compressed-domain video editing processor 18 of the 
present invention can be incorporated into a video coding system as shown in Figure 10. 
As shown in Figure 10, the coding system 300 comprises a video encoder 310, a video 
decoder 330 and a video editing system 2. The editing system 2 can be incorporated in a 
separate electronic device, such as the portable device 1 in Figure 9. However, the 
editing system 2 can also be incorporated in a distributed coding system. For example, 
the editing system 2 can be implemented in an expanded decoder 360, along with the 
video decoder 330, so as to provide decoded video data 190 for displaying on a display 
device 332. Alternatively, the editing system 2 is implemented in an expanded encoder 
350, along with the video encoder 310, so as to provide edited video data to a separate 
video decoder 330. The edited video data can also be conveyed to a transmitter 320 for 
transmission, or to a storage device 340 for storage. 

Some or all of the components 2, 310, 320, 330, 332, 340, 350, 360 can be 
operatively connected to a connectivity controller 356 (or 356', 356") so that they can 
operate as remote-operable devices in one of many different ways, such as Bluetooth, 
infrared or wireless LAN. For example, the expanded encoder 350 can communicate with 
the video decoder 330 via a wireless connection. Likewise, the editing system 2 can 
separately communicate with the video encoder 310 to receive data therefrom and with 
the video decoder 330 to provide data thereto. 

Thus, although the invention has been described with respect to one or more 
embodiments thereof, it will be understood by those skilled in the art that the foregoing 
and various other changes, omissions and deviations in the form and detail thereof may be 
made without departing from the scope of this invention. 